Reviews: Mean Field Residual Networks: On the Edge of Chaos
This paper analytically investigates the properties of ResNets with random weights using a mean field approximation. The approach is an extension of a previous analysis of feedforward neural networks. The authors show that, in contrast to feedforward networks, ResNets exhibit subexponential (polynomial or logarithmic) dynamics when inputs are propagated forward or gradients are propagated backward through the layers. The results are very interesting because they give an analytic justification and intuition for why ResNets with a large number of layers can be trained reliably. The paper also extends the mean field technique for studying neural network properties, which is of value for analyzing other architectures.
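The mean field analysis the review refers to tracks a single scalar, the pre-activation variance q_l, across layers. As a minimal sketch (not the paper's code), assuming a tanh residual network with unit-variance weights, the residual recursion q_{l+1} = q_l + E_{z~N(0,1)}[tanh(sqrt(q_l) z)^2] can be iterated numerically, with the Gaussian expectation evaluated by Gauss-Hermite quadrature:

```python
import numpy as np

# Probabilists' Gauss-Hermite nodes/weights (weight function exp(-z^2/2));
# normalizing by sqrt(2*pi) turns the sum into an expectation over N(0, 1).
nodes, weights = np.polynomial.hermite_e.hermegauss(64)
weights = weights / np.sqrt(2 * np.pi)

def gauss_expect(f, q):
    """E_{z~N(0,1)}[ f(sqrt(q) * z) ] via quadrature."""
    return float(np.sum(weights * f(np.sqrt(q) * nodes)))

# Iterate the residual mean field recursion for the variance q_l.
q = 1.0
qs = [q]
for _ in range(1000):
    q = q + gauss_expect(lambda z: np.tanh(z) ** 2, q)
    qs.append(q)
# For tanh, each increment approaches 1 as q grows, so q_l grows roughly
# linearly in depth l -- polynomial, not exponential, dynamics.
```

The same loop with the feedforward update q_{l+1} = E[tanh(sqrt(q_l) z)^2] (no additive q_l term) illustrates the contrast the review draws with the earlier feedforward analysis.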
Mean Field Residual Networks: On the Edge of Chaos
We study randomly initialized residual networks using mean field theory and the theory of difference equations. We show that, in contrast to feedforward networks with their exponential forward and backward dynamics, adding skip connections causes the network, depending on the nonlinearity, to adopt subexponential forward and backward dynamics, in many cases in fact polynomial. The exponents of these polynomials are obtained through analytic methods and verified empirically. In terms of the "edge of chaos" hypothesis, these subexponential and polynomial laws allow residual networks to "hover over the boundary between stability and chaos," thus preserving the geometry of the input space and the flow of gradient information. In our experiments, for each activation function studied here, we initialize residual networks with different hyperparameters and train them on MNIST.
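The contrast in forward dynamics is easy to observe directly. Below is a small NumPy sketch (an illustration under assumed settings, not the paper's experiment) that propagates a random input through a randomly initialized tanh network with and without skip connections and records the mean squared activation per layer:

```python
import numpy as np

def forward_dynamics(depth, width, residual, seed=0):
    """Propagate a random input through a random tanh network and
    record the mean squared activation at every layer."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(width)
    norms = [float(np.mean(x ** 2))]
    for _ in range(depth):
        # Weight entries scaled so pre-activations have variance ~ mean(x^2).
        W = rng.standard_normal((width, width)) / np.sqrt(width)
        h = np.tanh(W @ x)
        x = x + h if residual else h  # the skip connection changes the dynamics
        norms.append(float(np.mean(x ** 2)))
    return np.array(norms)

res = forward_dynamics(depth=100, width=512, residual=True)
ff = forward_dynamics(depth=100, width=512, residual=False)
# The residual network's activation norm grows slowly (roughly linearly in
# depth for tanh), while the plain tanh network's norm decays toward zero.
```

Running this shows the "hovering" behavior: the residual network's signal neither explodes nor vanishes over 100 layers, whereas the plain network's signal degenerates.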